Knowledge Base Augmentation using Tabular Data
Abstract
Large linked data repositories have been built by leveraging semi-structured data in Wikipedia (e.g., DBpedia) and through extracting information from natural language text (e.g., YAGO). However, the Web contains many other vast sources of linked data, such as structured HTML tables and spreadsheets. Often, the semantics in such tables is hidden, preventing one from extracting triples from them directly. This paper describes a probabilistic method that augments an existing knowledge base with facts from tabular data by leveraging a Web text corpus and natural language patterns associated with relations in the knowledge base. A preliminary evaluation shows high potential for this technique in augmenting linked data repositories.
Similar Resources
Semantic Search in Tabular Structures
The Semantic Web search aims to overcome the bottleneck of finding relevant information using formal knowledge models, e.g. ontologies. The focus of this paper is to extend a typical search engine with semantic search over tabular structures. We categorize HTML documents into topics and genres. Using the TARTAR system, tabular structures in the documents are then automatically transformed into ...
Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings
This paper describes an abstractive summarization method for tabular data which employs a knowledge base semantic embedding to generate the summary. Assuming the dataset contains descriptive text in headers, columns and/or some augmenting metadata, the system employs the embedding to recommend a subject/type for each text segment. Recommendations are aggregated into a small collection of super t...
Improving Open Data Usability through Semantics
With the success of Open Data, a huge amount of tabular data has become available that could potentially be mapped and linked into the Web of (Linked) Data. The use of semantic web technologies would then make it possible to explore related content and would enable enhanced search functionality across data portals. However, existing linkage and labeling approaches mainly rely on mappings of textual information to classes...
Development of ICD-10-TM ontology for a semi-automated morbidity coding system in Thailand.
OBJECTIVES: The International Classification of Diseases and Related Health Problems, 10th Revision, Thai Modification (ICD-10-TM) ontology is a knowledge base created from the Thai modification of the World Health Organization International Classification of Diseases and Related Health Problems, 10th Revision. The objectives of this research were to develop the ICD-10-TM ontology as a knowledge...
Visual Design and On-line Verification of Tabular Rule-Based
The paper presents a new approach to the joint design and verification of rule-based systems. The principal idea is that verification should be performed on-line, incrementally, during system design. This allows for early detection and handling of knowledge base anomalies and inconsistencies. The proposed approach also offers an innovative visual tool for computer-aided desig...
Publication date: 2014